Analyzing Transformers via Value Matrices

Authors

Abstract

We propose a new method to analyze Transformer language models. In self-attention modules, attention weights are calculated from the query vectors and key vectors. Then, the output is obtained by taking the weighted sum of value vectors. While existing works on analysis have focused on attention weights, this work focuses on value matrices. We obtain joint matrices by multiplying both matrices, and show that the trace of the joint matrices is correlated with word co-occurrences.
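The computation the abstract describes can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say which two matrices are multiplied to form a joint matrix, so the sketch assumes, purely for illustration, a value projection `W_V` and an output projection `W_O`. All names and toy numbers are hypothetical, and the `1/sqrt(d)` scaling follows the standard Transformer formulation rather than anything stated in the abstract.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Single-head scaled dot-product attention (standard formulation)."""
    d = len(keys[0])
    keys_t = [list(col) for col in zip(*keys)]
    scores = matmul(queries, keys_t)
    # Attention weights come from queries and keys only.
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    # The output is the weighted sum of the value vectors.
    return weights, matmul(weights, values)

def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

# Toy inputs (hypothetical numbers, two tokens of dimension two).
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
weights, output = attention(queries, keys, values)

# ASSUMPTION: the joint matrix is the product of a value projection and an
# output projection; the abstract only says "multiplying both matrices".
W_V = [[1.0, 2.0], [0.0, 1.0]]
W_O = [[3.0, 0.0], [1.0, 2.0]]
joint = matmul(W_V, W_O)
joint_trace = trace(joint)
```

In this toy setting `joint_trace` is a single scalar per pair of matrices, which is the kind of quantity that could then be compared against word co-occurrence statistics.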

Similar resources

Singular value inequalities for positive semidefinite matrices

In this note, we obtain some singular value inequalities for positive semidefinite matrices by using the block matrix technique. Our results are similar to some inequalities shown by Bhatia and Kittaneh in [Linear Algebra Appl. 308 (2000) 203-211] and [Linear Algebra Appl. 428 (2008) 2177-2191].

Analyzing AHP-matrices by regression

In the analytic hierarchy process (AHP) the decision maker makes comparisons between pairs of attributes or alternatives. In real applications the comparisons are subject to judgmental errors. Many AHP-matrices reported in the literature are found to be such that the logarithm of the comparison ratio can be sufficiently well modeled by a normal distribution with a constant variance. On the basis ...

Analyzing Chromatin Using Tiled Binned Scatterplot Matrices

Background: Over the last years, more and more biological data became available. Besides the pure amount of new data, also its dimensionality (the number of different attributes per data point) increased. Recently, especially the amount of data on chromatin and its modifications increased considerably. In the field of epigenetics, appropriate visualization tools designed for highlighting the differ...

A NOTE VIA DIAGONALITY OF THE 2 × 2 BHATTACHARYYA MATRICES

In this paper, we consider characterizations based on the Bhattacharyya matrices. We characterize, under certain constraint, distributions such as normal, compound Poisson and gamma via the diagonality of the 2 × 2 Bhattacharyya matrix.

Monad Transformers as Monoid Transformers

The incremental approach to modular monadic semantics constructs complex monads by using monad transformers to add computational features to a preexisting monad. A complication of this approach is that the operations associated to the pre-existing monad need to be lifted to the new monad. In a companion paper by Jaskelioff, the lifting problem has been addressed in the setting of system Fω. Her...

Journal

Journal title: Transactions of The Japanese Society for Artificial Intelligence

Year: 2023

ISSN: 1346-0714, 1346-8030

DOI: https://doi.org/10.1527/tjsai.38-2_c-mb7